Return-Path: <icon-group-sender>
Received: from kingfisher.CS.Arizona.EDU (kingfisher.CS.Arizona.EDU [192.12.69.239])
by baskerville.CS.Arizona.EDU (8.8.7/8.8.7) with SMTP id MAA14427
for <icon-group-addresses@baskerville.CS.Arizona.EDU>; Fri, 13 Mar 1998 12:35:30 -0700 (MST)
Received: by kingfisher.CS.Arizona.EDU (5.65v4.0/1.1.8.2/08Nov94-0446PM)
id AA17660; Fri, 13 Mar 1998 12:35:29 -0700
Message-Id: <3509869A.6F03@gte.net>
Date: Fri, 13 Mar 1998 13:18:50 -0600
From: Mark Evans <evans@gte.net>
Reply-To: evans@gte.net
Organization: None
X-Mailer: Mozilla 3.01 (Win95; I)
Mime-Version: 1.0
To: icon-group@optima.CS.Arizona.EDU
Subject: Re: Letter Probabilities
References: <199803131831.KAA01976@sims-rd.corp.cirrus.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO
Content-Length: 3003
Eka,
Thank you for taking time to reply so thoughtfully. I thought of this
technique before making my post, and have two comments about it. It is
a valid approach.
(1) Some letters occur with vanishingly small, but nonzero,
probabilities. In order to handle them, the generator string would have
to be extremely long just to have the letter occur once or twice in the
string! We are talking about several thousand characters in a string.
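
To put a number on point (1), here is a minimal sketch of how long the generator string has to be before the rarest letter shows up at all. The probabilities are invented for illustration, and `build_generator_string` is a hypothetical helper, not something from this thread. Python is used only as runnable notation.

```python
import math
from fractions import Fraction

def build_generator_string(probs, min_count=1):
    """Repeat each letter in proportion to its probability.

    The target length n is chosen so that the rarest letter appears at
    least min_count times. Rounding of the other counts can make the
    final length differ slightly from n in general.
    """
    p_min = min(probs.values())
    n = math.ceil(min_count / p_min)
    return "".join(letter * round(p * n) for letter, p in probs.items())

# Hypothetical probabilities; exact fractions avoid float rounding surprises.
probs = {
    "e": Fraction(9000, 10000),
    "t": Fraction(995, 10000),
    "z": Fraction(5, 10000),
}
s = build_generator_string(probs)
print(len(s), s.count("z"))
```

With these made-up numbers, a letter that occurs once in 2000 forces a string of 2000 characters just to include it a single time.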
(2) There is a more elegant Icon syntax for producing a random character
from a string, namely ?string. I assume that this operator picks each
position in the string with equal probability, 1/N where N = *string.
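
If that assumption holds, the probability of drawing a given letter is simply its count divided by the string length. The sketch below illustrates this in Python, where `random.choice(s)` stands in for Icon's `?s`. The string `"aaab"`, the seed, and the helper name `letter_probabilities` are arbitrary choices for the example.

```python
import random
from collections import Counter

def letter_probabilities(s):
    """Probability of each letter under a uniform pick of one position in s."""
    n = len(s)
    return {ch: count / n for ch, count in Counter(s).items()}

s = "aaab"
expected = letter_probabilities(s)   # 'a' fills 3 of 4 positions, 'b' fills 1 of 4

# Empirical check with a seeded generator so the run is repeatable.
rng = random.Random(1998)
observed = Counter(rng.choice(s) for _ in range(10_000))
print(expected, dict(observed))
```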
What you propose is valid and I will think it through once more. When I
first rejected the idea, it was because I had the impression that Icon
could not handle strings of more than 255 characters. That appears not
to be the case.
FYI, I did implement the while-loop mentioned earlier, and it is
reasonably fast. In fact I am impressed overall with the speed of Icon
as an interpreted language for many of these inner nested loops.
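
The while-loop itself is not shown in this message. One common loop-based technique, which may or may not match it, is a cumulative-probability scan: draw one uniform number and walk the probability table until the running total exceeds it. This handles very rare letters without a long generator string. The sketch below is in Python, and `pick_letter` with its parameters is hypothetical.

```python
import random

def pick_letter(letters, probs, rng=random):
    """Pick one letter; probs[i] is the probability of letters[i].

    rng only needs a random() method returning a float in [0, 1).
    """
    r = rng.random()
    i = 0
    acc = probs[0]
    # Advance until the running total passes r. The index guard keeps the
    # last letter as a fallback if rounding leaves the total just below 1.
    while r >= acc and i < len(letters) - 1:
        i += 1
        acc += probs[i]
    return letters[i]

# Made-up table: 'z' is rare but costs nothing extra to represent.
letters = "etz"
probs = [0.9, 0.0995, 0.0005]
print(pick_letter(letters, probs, random.Random(1998)))
```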
I recall reading in one of the online archives about problems with
Icon's built-in random number generator. Such problems would affect
random string access with ?. Because I am concerned with statistical
behaviors, I need to know if anyone has any information about these
issues, or whether they have been solved.
Random number generators can be funny things. I once worked for a
former VP of research at a big aerospace concern. He told me a story.
There was once an argument over the merits of a particular FORTRAN
random number generator used in Monte Carlo simulations and
integrations. The author of this module produced all kinds of 2D and 3D
scatter plots showing its random characteristics. It would always
appear to fill the space.
The man harboring suspicions ended the argument with a single plot of
his own. He had the generator produce long sequences of (x,y,z)
coordinates to make another 3D scatter-plot. This time, however, he
changed the viewpoint so that the true nature of the algorithm became
apparent to all.
All the points were on a 2D plane embedded in 3-space. You had to see
the plane edge-on to observe this in the scatter plot. Funny things can
happen.
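
The story does not say which generator was involved, so the following only illustrates the same kind of defect. It uses IBM's RANDU, a multiplicative congruential generator from the FORTRAN era that is well known for it. Because 65539 = 2^16 + 3, every three consecutive outputs satisfy x[k+2] = 6*x[k+1] - 9*x[k] (mod 2^31), so the (x, y, z) triples fall on at most 15 parallel planes rather than filling the cube. The sketch is in Python and the seed is arbitrary.

```python
M = 2**31

def randu(seed, count):
    """RANDU: x[k+1] = 65539 * x[k] mod 2**31 (the seed should be odd)."""
    x = seed
    out = []
    for _ in range(count):
        x = (65539 * x) % M
        out.append(x)
    return out

xs = randu(seed=1, count=1000)

# Linear relation among consecutive triples, taken modulo 2**31.
residues = {(xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % M
            for k in range(len(xs) - 2)}

# Each distinct integer multiple of 2**31 corresponds to one plane.
planes = {(xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) // M
          for k in range(len(xs) - 2)}
print(residues, len(planes))
```

Viewed from a generic angle, a scatter plot of these triples looks uniformly filled. The planes only become visible when the viewpoint lines up with them.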
Best regards,
Mark
Eka Laiman wrote:
>
> Mark Evans wrote:
> > Here is a small Icon problem related to letter probabilities.
>
> This is what I would do:
>
> (1) Construct a string of length n where the members of the string
> are letters and the number of occurrence of each letter is
> according to the probability of letter occurrence
>
> Floyd's algorithm of generating random permutation from 1 to n:
>
> (1) Fill an array L such that L[i] := i
> (2) every i := n to 2 by -1 do {
>        t := ?i          # generate a random index from 1 to i
>        L[i] :=: L[t]    # exchange the t-th element with the last
>                         # element in the current sublist
>     }
>
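
The swap loop quoted above is the form widely known as the Fisher-Yates shuffle. Here is a minimal runnable sketch of it in Python, using 0-based indexing where the quoted steps use 1-based indexing. The function name `shuffle_in_place` and the seed are arbitrary choices for the example.

```python
import random

def shuffle_in_place(items, rng=random):
    """Permute items uniformly at random using the quoted swap loop."""
    # Walk from the last position down to the second one.
    for i in range(len(items) - 1, 0, -1):
        t = rng.randint(0, i)                    # random index in 0..i inclusive
        items[i], items[t] = items[t], items[i]  # swap into the current last slot
    return items

# Step (1) of the quoted description: L[i] := i for i = 1..n, here with n = 10.
perm = shuffle_in_place(list(range(1, 11)), random.Random(1998))
print(perm)
```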